2 research outputs found

    A Machine Learning Approach to Reduce Dimensional Space in Large Datasets

    Get PDF
    Large datasets computing is a research problem as well as a huge challenge due to massive amounts of data that are mined and crunched in order to successfully analyze these massive datasets because they constitute a valuable source of information over different and cross-folded domains, and therefore it represents an irreplaceable opportunity. Hence, the increasing number of environments that use data-intensive computations need more complex calculations than the ones applied to grid-based infrastructures. In this way, this paper analyzes the most commonly used algorithms regarding to this complex problem of handling large datasets whose part of research efforts are focused on reducing dimensional space. Consequently, we present a novel machine learning method that reduces dimensional space in large datasets. This approach is carried out by developing different phases: merging all datasets as a huge one, performing the Extract, Transform and Load (ETL) process, applying the Principal Component Analysis (PCA) algorithm to machine learning techniques, and finally displaying the data results by means of dashboards. The major contribution in this paper is the development of a novel architecture divided into five phases that presents an hybrid method of machine learning for reducing dimensional space in large datasets. In order to verify the correctness of our proposal, we have presented a case study with a complex dataset, specifically an epileptic seizure recognition database. The experiments carried out are very promising since they present very encouraging results to be applied to a great number of different domains.This work was partially funded by Grant RTI2018-094283-B-C32, ECLIPSE-UA (Spanish Ministry of Education and Science), and in part by the Lucentia AGI Grant. This work was partially funded by GENDER-NET Plus Joint Call on Gender an UN Sustainable Development Goals (European Commission - Grant Agreement 741874), funded in Spain by “La Caixa” Foundation (ID 100010434) with code LCF/PR/DE18/52010001 to MTH
    corecore